-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Fix MAU reaping where reserved users are specified. #6168
Conversation
for i in range(initial_users): | ||
user = "@user%d:server" % i | ||
email = "user%[email protected]" % i | ||
self.store.upsert_monthly_active_user(user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.store.upsert_monthly_active_user(user) | |
self.get_success(self.store.upsert_monthly_active_user(user)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etc for other store functions
query_args = [] | ||
query_args.extend(self.reserved_users) | ||
query_args.append(safe_guard) | ||
query_args.extend(self.reserved_users) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can write this as query_args = [*self.reserved_users, safe_guard, *self.reserved_users]
|
||
# Update reserved_users to ensure it stays in sync, this is important | ||
# for reaping. | ||
self.reserved_users = users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels wrong, as there's nothing to say that get_registered_reserved_users_count
is going to get called to make this accurate. Can't we query the database to get the reserved_users
when we need it? (And potentially cache it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, opted not to cache because method is called every hour and ought to be light.
else: | ||
safe_guard = max_mau_value - len(self.reserved_users) | ||
# Must be greater than zero for postgres | ||
safe_guard = safe_guard if safe_guard > 0 else 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not just safe_guard = max(max_mau_value - len(self.reserved_users), 0)
. Also can we rename it to something like num_users_to_remove
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have renamed, but the sense is the opposite, because it is number of non reserved users to save.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of formatting nits, otherwise good to go
",".join(questionmarks) | ||
) | ||
query_args.extend(reserved_users) | ||
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks) | |
sql = base_sql + " AND user_id NOT IN ({})".format(question_marks) |
ORDER BY timestamp DESC | ||
LIMIT ? | ||
) | ||
AND user_id NOT IN ({})""".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AND user_id NOT IN ({})""".format( | |
AND user_id NOT IN ({}) | |
""".format( |
user_id = yield self.hs.get_datastore().get_user_id_by_threepid( | ||
tp["medium"], tp["address"] | ||
) | ||
if user_id: | ||
count = count + 1 | ||
return count | ||
users = users + (user_id,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd probably be easier to return a list rather than using tuples, really.
The bug I am fixing is quite hard to explain, but in short, in some cases the reaping background task would delete too many users from the
monthly_active_user
table if reserved users had been specified.